Overview

Getting started with large language models

AI

See Book version here: AI Book

AI in the Popular Press

State of AI

General Overview

If you’re getting started, I recommend beginning with the Substack of Wharton’s Ethan Mollick: One Useful Thing is regularly updated with practical tips and summaries relevant for a general audience.1

His Substack, now with over 100K subscribers, is the basis for his recently published book, Co-Intelligence: Living and Working with AI, a must-read general introduction for the business reader.

Because AI companies and technologies are changing so rapidly, I’ve lost track of the most up-to-date summary of the various commercial (and free) options. ChatGPT is the most famous, but new ones appear almost daily and it’s no longer possible to keep up.

Mollick’s How to Use AI to Do Stuff: An Opinionated Guide was the best such guide as of December 2023.

This chart is already obsolete after the introduction of Claude 3 and Gemini 1.5.

from OneUsefulThing.org

How Smart is ChatGPT

Visual Capitalist makes a nice graphic comparing ChatGPT against various standardized test benchmarks.

Marcus Lu on Visual Capitalist

Stanford’s Artificial Intelligence Index Report 2024, published on 15 April by the Institute for Human-Centered Artificial Intelligence at Stanford University in California, says AI now beats humans at basic tasks.

Understanding LLMs

Explaining AI and LLMs

Stephen Wolfram explains What is ChatGPT Doing and Why Does it Work? in a lengthy, step-by-step way. It took me a couple of hours to read – it’s very dense and technical – but the time was well-spent.

The conclusion, how LLMs point the way to a better understanding of human language and thinking, is especially provocative. And don’t miss the Hacker News discussion: Wolfram Alpha and ChatGPT

ChatGPT explained for Normies

Jason Crawford asks “Can submarines swim?” with a good easy-to-understand explanation of the implications of GPT technology. His conclusion: recognize that it’s a tool, with advantages and disadvantages over “thinking” similar to the way submarines are different from swimming.

AI Safety

AI Safety Issues

Stanford researchers say only about half of the sentences generated by the search engines in response to a query could be fully supported by factual citations.2

And see many more arguments against claims that AI is the biggest thing ever:

Freddie deBoer’s Does AI Just Suck? shows examples of how AI-generated art is pretty awful when set side by side with the human kind.

If I asked an aspiring filmmaker who their biggest influences were and they answered “every filmmaker who has ever lived,” I wouldn’t assume they were a budding auteur. I would assume that their work was lifeless and drab and unworthy of my time.

Also see the Santa Fe Institute’s Melanie Mitchell (@MelMitchell1), who in How do we know how smart AI systems are? gives several examples of ChatGPT appearing to do well on some task, only to be defeated when the wording is changed slightly.

In Can Large Language Models Reason? she explains in detail many of the issues raised when asking whether LLMs do more than pattern-matching. Some of the references here seem to show good evidence that, no, these models are not reasoning but are simply good at finding similar examples in their extensive training data sets.

What AI Can’t Do

Political orientation

The Politics of ChatGPT

AI Performance

State of AI Report Compute Index tracks the size of public, private and national HPC clusters, as well as the utilization of various AI chips in AI research papers. As a key substrate upon which AI models are trained and run, the size of compute clusters and the popularity of specific chips helps us take a temperature check on the rate of progress of AI systems.

from State of AI Report

From Time (offline), why AI is unlikely to slow down:

Tasks where AI can beat humans

But AI has a memory wall problem

The rise of LLaMA.cpp was surprising to many. Why can an ordinary MacBook run state-of-the-art models only a bit slower than a mighty A100? The answer is that when you’re only doing a single batch, one token at a time, memory bandwidth is the bottleneck and the gap between the M2’s memory bandwidth and the A100’s is not that large.
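The arithmetic behind this is worth sketching: in single-batch decoding, every model weight must be streamed from memory once per generated token, so memory bandwidth divided by model size gives a rough upper bound on tokens per second. The bandwidth and model-size figures below are approximate public specs used purely for illustration, not benchmarks.

```python
# Back-of-the-envelope: single-batch LLM decoding is memory-bandwidth bound.
# Each token requires reading all weights, so:
#   tokens/sec <= memory bandwidth (bytes/s) / model size (bytes)

def decode_tokens_per_sec(bandwidth_gb_s: float, params_billions: float,
                          bytes_per_param: float) -> float:
    """Upper bound on decode speed, ignoring compute and KV-cache traffic."""
    model_bytes = params_billions * 1e9 * bytes_per_param
    return bandwidth_gb_s * 1e9 / model_bytes

# A 7B-parameter model quantized to 4 bits (0.5 bytes per parameter).
# Assumed bandwidths: Apple M2 ~100 GB/s, NVIDIA A100 ~2,000 GB/s.
m2 = decode_tokens_per_sec(bandwidth_gb_s=100, params_billions=7, bytes_per_param=0.5)
a100 = decode_tokens_per_sec(bandwidth_gb_s=2000, params_billions=7, bytes_per_param=0.5)

print(f"M2:   ~{m2:.0f} tokens/sec upper bound")
print(f"A100: ~{a100:.0f} tokens/sec upper bound")
```

On these rough numbers the A100 ceiling is only about 20× the laptop’s, and a few dozen tokens per second is already comfortable reading speed, which is why a MacBook feels “only a bit slower” for interactive use.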

History

Also see @omooretweets summary of AI in 2023


Feb 24, 2023: Meta launches LLaMA, a relatively small, open-source AI model.

March 3, 2023: LLaMA is leaked to the public, spurring rapid innovation.

March 12, 2023: Artem Andreenko runs LLaMA on a Raspberry Pi, inspiring minification efforts.

March 13, 2023: Stanford’s Alpaca adds instruction tuning to LLaMA, enabling low-budget fine-tuning.

March 18, 2023: Georgi Gerganov’s 4-bit quantization enables LLaMA to run on a MacBook CPU.

March 19, 2023: Vicuna, a 13B model, achieves “parity” with Bard at a $300 training cost.

March 25, 2023: Nomic introduces GPT4All, an ecosystem gathering models like Vicuna at a $100 training cost.

March 28, 2023: Cerebras trains an open-source GPT-3 architecture, making the community independent of LLaMA.

March 28, 2023: LLaMA-Adapter achieves SOTA multimodal ScienceQA with 1.2M learnable parameters.

April 3, 2023: Berkeley’s Koala dialogue model rivals ChatGPT in user preference at a $100 training cost.

April 15, 2023: Open Assistant releases an open-source RLHF model and dataset, making alignment more accessible.
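The 4-bit quantization in the March 18 entry above can be illustrated with a toy sketch: store each weight as a small integer in [-8, 7] plus one shared floating-point scale. This shows only the core idea; llama.cpp’s actual formats quantize in small blocks with per-block scales.

```python
# Toy symmetric 4-bit quantization: the idea behind running LLaMA on a
# MacBook CPU, not llama.cpp's actual block-wise implementation.

def quantize_4bit(weights):
    """Map float weights to integers in [-8, 7] with one shared scale."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize_4bit(q, scale):
    """Recover approximate float weights from 4-bit integers."""
    return [x * scale for x in q]

weights = [0.31, -1.24, 0.07, 2.10, -0.55]
q, scale = quantize_4bit(weights)
recovered = dequantize_4bit(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, recovered))

# Each weight now needs 4 bits instead of 32, an 8x memory saving,
# at the cost of a rounding error bounded by scale / 2.
assert max_err <= scale / 2 + 1e-9
```

Shrinking a 7B-parameter model from ~28 GB of float32 weights to ~3.5 GB is what brings it within an ordinary laptop’s memory, at a modest accuracy cost.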

OpenAI History and Principles

References

Liu, Nelson F., Tianyi Zhang, and Percy Liang. 2023. “Evaluating Verifiability in Generative Search Engines.” arXiv. http://arxiv.org/abs/2304.09848.

Footnotes

  1. The Wall Street Journal profiled him in Apr 2024, calling him a “power broker” for his newfound role as an intermediary between the tech companies and businesses looking to get up to speed on the technology.↩︎

  2. Liu, Zhang, and Liang (2023)↩︎